Summary

This report summarizes the five-year SATA (Stochastic Algebraic Topology and
Applications) project involving three US investigators together with Robert Adler and his
team at the Technion (Israel), who are working on a similar grant with the European Office
of Aerospace Research and Development. We have not always been able to separate the
work done by the American group from that of the Israeli team.

The  project  has  led  to  approximately  50  papers,  the  vast  majority  of  which  are  already 
published  or  submitted.  These  include  results  related  to  the  statistics  of  random 
functions,  random  complexes,  and  random  manifolds  and  embeddings.  In  addition,  there 
have  been  some  extensions  of  the  scope  of  the  project  to  include  some  new  problems  and 
areas  related  to  the  theme  (stochastic  algebraic  topology)  that  were  not  mentioned  in  the 
original  proposal  in  terms  of  connections  to  e.g.  order  statistics,  sample  complexity 
bounds,  and  topological  sampling  theory. 

This grant has also played a role in the training of graduate students and postdoctoral
fellows, the dissemination of results, and general educational activity on the importance of
stochastic algebraic topology as a tool in data analysis.
Objectives

Recall  that  the  Stochastic  Algebraic  Topology  and  Applications  (SATA)  project 
aims  to  exploit  recent  advances  in  the  complementary  areas  of  topology  and  stochastic 
processes  to  tackle  a  wide  range  of  data  analytic  problems  of  broad  importance.  Treating 
data  topologically  is  crucial  in  scenarios  in  which  it  is  important  to  detect,  localize,  and 
perhaps  perform  an  initial  classification  of  objects  without  attempting  to  completely 
characterize  them.  Adding  a  stochastic  element  allows  for  the  almost  pervasive  situation 
in  which  the  data  itself  is  imperfectly  observed  due  to  the  presence  of  background  noise. 
As  current  probabilistic  and  statistical  methodology  is  ill  suited  to  detect  such  qualitative 
structures,  the  project  aims  to  develop  generic  stochastic  models  whose  topological 
structures  are  amenable  to  mathematical  analysis,  as  a  first  step  towards  implementation 
of  a  broader,  more  quantitative  program.  Core  topics  include  random  functions  on 
manifolds,  random  manifolds  created  by  random  embeddings,  and  random  manifolds 
arising  in  machine  learning,  along  with  their  theoretical  and  practical  interplay. 
Secondary  topics  include  the  analysis  of  associated  algorithms,  and  the  topological 
understanding  of  random  spaces  that  arise  in  particular  stochastic  models.  We  have  also 
studied  implementation  and  application  of  these  ideas  on  some  problems  coming  from 
engineering  and  physics. 


Accomplishments 

This  is  basically  a  mathematics-based  project,  and  the  methods  are  hard  analysis 
supplemented  with,  and  often  motivated  by,  computation.  Below  we  describe  three 
general  areas  in  which  we  have  made  significant  progress,  and  a  fourth  area  of  new  ideas 
that  were  generated  by  our  projects.  The  headings  all  correspond  to  topics  in  the  original 
proposal,  which  provides  background  material. 

The research in this project grew out of two different types of topological data
analysis: statistical and geometric. The statistical problems came from the theory of
Gaussian (and related) random fields, which uses topological invariants as proxies for
measurements of more direct interest (such as excursion probabilities) and as robust
signatures of complicated data (such as time series, random fields, moving
objects, etc.).¹ The second source was geometric: trying to find useful geometries of
data that can be used to improve data analysis.²

The topological sampling problem is one of the first theoretical problems in
topological data analysis: assuming that data is being sampled from a probability
distribution of geometric origin, can one determine the underlying geometry? The
literature on this and related problems has grown substantially and the picture is
considerably different than it was at the beginning of the project.

¹ A convenient source for this is the book “Random Fields and Geometry” by
Adler and Taylor and the draft of a second volume (coauthored with K. Worsley) that can
be found on Adler’s web page.

² An overview of the philosophy of this work can be found in G. Carlsson’s survey
paper in Acta Numerica (Vol 23 (2014) 289-368).

At  the  time  the  project  began,  a  number  of  papers  showed  that  if  a  distribution 
was  sampled  from  a  “well  conditioned”  manifold  with  “not  too  much”  noise,  then  with 
“enough”  samples,  with  high  probability,  one  could  recover  the  topology  (homotopy  type 
or  even  the  diffeomorphism  type)  of  the  manifold.  Different  and  deep  versions  of  this 
result,  and  new  algorithms  that  work  more  or  less  efficiently  in  different  contexts,  are  still 
being  discovered  and  progress  on  this  basic  problem  remains  important. 

Among  the  issues  that  our  work  has  illuminated  directly  related  to  topological 
sampling  are: 

(1)  The  theoretical  limits  on  when  topological  reconstruction  is  possible  [20]. 

(2)  The  rates  of  convergence  of  specific  algorithms  for  computation  of 
topological  invariants  (and  related  phase  transitions)  such  as  homology  and 
homotopy  [4,5,17,18]  or  Euler  integrals  [30]. 

(3)  Connections  to  percolation  theory  [51]. 

(4)  The  behavior  of  these  algorithms  on  pure  noise  (as  preparation  for 
understanding  better  noisy  data  sets)  [4,50,31]. 

(5)  Stability  properties  of  topological  invariants  of  functions  [16]. 

(6)  Lower  bounds  on  sampling,  Kolmogorov  and  logical  complexity  of  some  of 
these  reconstruction  problems  [47]. 

(7)  What  happens  to  (6)  in  the  generic,  as  opposed  to  worst  case,  situations  [7]. 
([7]  directly  solves  one  of  the  main  problems  identified  in  the  original 
proposal,  and  we  will  discuss  it  in  more  detail  below.) 

(8)  Reparametrization  invariant  functionals  on  time  series  [52,53]. 

(9)  Typical  shapes  of  discretized  loops  with  topological  constraints  [8]. 

(10)  Applications  of  critical  point  theory  to  statistical  inference  problems 
[12,19,21,24,25,26,27,28,34,35,36,37,38,40,41]. 

We  also  made  progress  in  furthering  the  applications  of  this  work,  developing 
new  topological  signatures  [44,  53],  and  have  initiated  work  on  finding  some  new 
topological  invariants  that  are  more  computable  than  the  traditional  ones.  Finally,  we 
mention  some  results  that  stem  from  our  topological  study  that  are  not  particularly 
stochastic. 

1:  Statistics  of  random  functions 

From a mathematical standpoint, our basic setup here starts with a base
topological space, M, typically but not always a manifold, and a random function f on M,
with values in R^n for some n ≥ 1. Given a realization of such a function, we proposed
studying various statistics pertaining to it, including properties of the set of critical points,
critical values, sub- and super-level sets, etc. These arise in both themes mentioned
above.

(i) In joint work with Eliran Subag, a Technion graduate student, Adler and Taylor
provided a new approach, along with extensions, to results in two important papers
of Worsley, Siegmund and coworkers closely tied to the statistical analysis of fMRI
(functional magnetic resonance imaging) brain data. These papers studied approximations
for the exceedance probabilities of scale and rotation space random fields, the latter
playing an important role in the statistical analysis of fMRI data. The techniques used
there came either from the Euler characteristic heuristic or via tube formulae, and to a
large extent were carefully attuned to the specific examples of the papers. In [1] they
treated the same problem, but via calculations based on the so-called Gaussian kinematic
formula. This allowed for extensions of the Worsley-Siegmund results to a wide class of
non-Gaussian cases. In addition, it allows one to obtain results for rotation space random
fields in any dimension via reasonably straightforward Riemannian geometric
calculations. Previously only the two-dimensional case could be covered, and then only
via computer algebra.

(ii) The paper by Adler, Moldavskaya and Samorodnitsky [2] studied, in a
one-dimensional setting, the problem of whether or not two or more points which lie in an
excursion set of a smooth random process belong to the same connected component. This
is a fundamental problem, at the level of connectivity, that has eluded successful analysis
for a number of years.

Adler and Samorodnitsky, in a later paper [6], take this much further, to the setting
of continuous Gaussian random fields on higher dimensional Euclidean spaces, and
address the question of how likely it is for the excursion sets to have a "hole" of a
certain dimension and depth. Answering this question in full generality appears to be
impossible at the moment, but their paper makes significant progress. Specifically, they
determine how likely such a field is to be above a high level on one compact set (e.g. a
sphere) and to be below a fraction of that level on some other compact set (e.g. at the
center of the corresponding ball). These questions have clear, and sometimes surprising
and counter-intuitive, answers at the level of large deviations.

(iii) Naitzat and Adler [30] proved a central limit theorem for the Euler integral of a
Gaussian random field. Recall that Euler integrals of deterministic functions have
recently been shown to have a wide variety of possible applications, including in signal
processing, data aggregation and network sensing. Adding random noise to these
scenarios, as is natural in the majority of applications, leads to a need for statistical
analysis, the first step of which requires asymptotic distribution results for estimators.
The first such result is provided in this paper, as a central limit theorem for the Euler
integral of pure Gaussian noise fields.

Proving these results turned out to be somewhat more complicated than was
originally expected. Fortunately, the actual central limit theorem is simple to state and,
equally importantly, simple to apply as an inference tool in real-life scenarios. On the
other hand, the proof required a sophisticated manipulation of the most recent advances
in central limit theorems for Gaussian functionals based on representations via the
Malliavin calculus.

This work assumes additional significance because of the work of Baryshnikov and
Ghrist relating Euler integrals to approaches to target enumeration. These authors and
Wright [14] develop a general Hadwiger theory for these invariants, which helps explain
their centrality.

(iv) Baryshnikov and Weinberger have studied the generic behavior of persistent
homology in various function spaces [16]. Unlike the usual stability theorems, which assert
that long bars are stable with respect to small perturbations, the results of
this paper describe the stability of the small bars with respect to smooth but possibly
large perturbations (where here large means large magnitude). We call this kind of
information the “jitter” of a function or a space. For a simple example, f(x) + sin nx
will have many local minima and maxima for n large, whenever f is a Lipschitz function
on an interval, say [a,b]. (This number grows like n(b-a), i.e. linearly in n and in the
length.)
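
The linear growth in n can be checked numerically. The following is a minimal sketch, not taken from [16]: the choice f(x) = |x - 1/2| on [0,1], the grid size, and the helper names `count_local_extrema` and `perturbed` are illustrative assumptions. It counts sign changes of the discrete slope of f(x) + sin nx; since sin nx alone has about n(b-a)/π turning points on [a,b], the counts should stay close to that value once n exceeds the Lipschitz constant of f.

```python
import math

def count_local_extrema(g, a, b, samples=50_000):
    """Count sign changes of the discrete slope of g on a uniform grid over [a, b]."""
    step = (b - a) / samples
    values = [g(a + i * step) for i in range(samples + 1)]
    count = 0
    prev_sign = 0
    for left, right in zip(values, values[1:]):
        diff = right - left
        sign = (diff > 0) - (diff < 0)
        if sign != 0:
            # A flip from rising to falling (or back) marks one local extremum.
            if prev_sign != 0 and sign != prev_sign:
                count += 1
            prev_sign = sign
    return count

def perturbed(f, n):
    """Return the function x -> f(x) + sin(n x)."""
    return lambda x: f(x) + math.sin(n * x)

# Assumed example: a Lipschitz function with constant 1 on [a, b] = [0, 1].
f = lambda x: abs(x - 0.5)
a, b = 0.0, 1.0
counts = {n: count_local_extrema(perturbed(f, n), a, b) for n in (50, 100, 200, 400)}

for n, c in counts.items():
    print(n, c, round(n * (b - a) / math.pi, 1))
```

Doubling n roughly doubles the count, in line with the n(b-a) growth described above.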

These statistics can give information about underlying mechanisms for data, and the
paper discusses sizes of craters on the moon and stock price time series (where the
persistent homology is consistent with a Hölder 1/2 function, although more subtle
dendrogram-type invariants do distinguish these from a time-reparametrised
exponentiated Brownian motion [52, 53]). These and additional theoretical results about
singularities and mathematical analogues are being written into a revised version of the
paper on jitter. Jitter also has a large-scale aspect, and is related to the short paper
[47].


Note the strange nature of the integrand in the limit formula for exponential samples
stated in Section 2 (note that intervals appear in it!³): it is essentially a current describing
the average number of times one should see an interval approximately of length [0, x/2] in
the persistence diagram. The blow up near the origin occurs because, with many points,
there are very many very short persistence intervals. Its quadratic nature is perhaps typical.
Similar effects occur for Brownian motion, as mentioned above. The nonzero measure
associated to long intervals is essentially a precise form of the crackle phenomenon.

³ Some might prefer the intervals replaced by characteristic functions of the
intervals, and then this formula can be viewed as an equality in a space of measures.

(v) Baryshnikov [52,53] initiated a study of reparametrization-invariant functionals of
time series, an important addition to the standard toolbox of data analysis, which relies
almost entirely on harmonic analysis, e.g. the Fourier transform in its different avatars.

One direction deals with the realization that the Reeb tree that can be associated to
a scalar univariate function carries more information than merely its barcode: the bars
have a chirality, i.e. they can fall or rise. Statistics of these rising or falling bars can signify,
in a reparametrization-invariant way, the asymmetry of the process. In [52] Baryshnikov
studies the baseline case of various Brownian motions, and the resulting asymmetries.
On the empirical side, he exhibited time irreversibility in several classes of time series.

Another model considered by Baryshnikov deals with cyclic but not necessarily periodic
processes (such as business cycles). Motivated by a classical theorem of K.-T. Chen, he
introduced an algorithm to recover a cyclicity [53], a family of time series sequentially
following a similar periodic pattern, perhaps with a desynchronized clock. The algorithm
relies on the notion of iterated integrals, and leads to remarkably reliable reconstructions
of cyclic orderings in some systems, in particular in the relative performance of industrial
sectors during the business cycles of the US economy.
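
The simplest iterated integral of a pair of time series is the signed area swept out by the planar path (x(t), y(t)), and it is unchanged when the clock is reparametrized, which is the kind of invariance this line of work exploits. The sketch below illustrates only that invariance; it is not the cyclicity algorithm of [53], and the function names, the test path (one turn of a unit circle) and the cubic clock warp are assumptions made for the example.

```python
import math

def signed_area(xs, ys):
    """Discrete version of (1/2) * integral of (x dy - y dx) along a sampled path."""
    total = 0.0
    for k in range(len(xs) - 1):
        total += xs[k] * ys[k + 1] - xs[k + 1] * ys[k]
    return 0.5 * total

def sample_path(clock, samples=20_000):
    """Sample one turn of the unit circle, read off a (possibly warped) clock on [0, 1]."""
    ts = [clock(k / samples) for k in range(samples + 1)]
    xs = [math.cos(2 * math.pi * t) for t in ts]
    ys = [math.sin(2 * math.pi * t) for t in ts]
    return xs, ys

# Same geometric path, two different clocks (the warp s -> s**3 is an assumption).
xs_u, ys_u = sample_path(lambda s: s)
xs_w, ys_w = sample_path(lambda s: s ** 3)

uniform_area = signed_area(xs_u, ys_u)
warped_area = signed_area(xs_w, ys_w)
swapped_area = signed_area(ys_u, xs_u)  # exchanging the two series flips the sign

print(uniform_area, warped_area, swapped_area)
```

Both clocks give an area close to π, and swapping the two series changes the sign, so the sign records which series leads the other.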

2:  Morse  theory,  critical  points,  Betti  numbers  and  random  complexes 

As  described  in  the  original  proposal,  we  planned  to  invest  considerable  effort  in 
the  study  of  the  topological  properties  of  random  simplicial  complexes  for  hypothesis 
testing. 

Adler and Bobrowski [5] considered, for a finite set of points P in R^d, the behavior
of the number of critical points of the distance function d_P : R^d → R^+, which measures
Euclidean distance to the set P. In particular, they studied the number of critical points of
d_P when P is a random sample from a given distribution, and the limit behavior of N_k,
the number of critical points of d_P with Morse index k, as the number of points in P goes
to infinity. They gave explicit computations for the normalized, limiting, expectations
and variances of the N_k, as well as distributional limit theorems. These results are related
to recent results of Kahle in which the Betti numbers of the random Cech complex based
on P were studied. The practical implication of these results lies in the design of sampling
algorithms for manifold learning via approximating simplicial complexes.

Similar  ideas,  applied  to  a  different  regime,  were  used  by  Bobrowski  and 
Weinberger  [18]  to  discover  the  phase  transitions  in  the  computation  of  homology  of 
Riemannian manifolds from Cech complexes, at least in the case of flat tori. They gave a
heuristic  indicating  that  the  same  results  should  apply  in  general,  but  did  not  give 
quantitative  results  about  the  rate  of  convergence.  This  is  ongoing  work. 

On  another  front,  Adler  and  Yogeshwaran  [50]  have  studied  random  complexes 
(generally  Cech  or  Rips)  generated  from  point  clouds  in  settings  where  the  underlying 
point  process  is  neither  Poisson  nor  a  simple  random  sample,  but  comes  from  a  general 
stationary  process  in  which  there  may  be  considerable  correlations  (either  positive  or 
negative)  between  different  regions.  A  typical  example  of  significant  current  interest 
from  both  theoretical  and  applied  points  of  view  is  given  by  determinantal  point 
processes.  Their  surprising  finding  is  that  many  of  the  results  from  the  better  known,  and 
simpler,  scenarios,  while  they  do  carry  over  in  principle  to  the  correlated  situation, 
involve  quantitative  differences  which  are  going  to  be  important  in  any  learning  or 
estimation  scenario. 

Additional results have been written up in [51] for the so-called 'thermodynamic'
regime (which includes the percolation threshold), in which
the complexes become very large and complicated, with complex homology
characterised by diverging Betti numbers. The proofs combine probabilistic arguments
from the theory of stabilizing functionals of point processes and topological arguments
exploiting the properties of Mayer-Vietoris sequences. The Mayer-Vietoris arguments
are crucial, since homology in general, and Betti numbers in particular, are global rather
than local phenomena, and most standard probabilistic arguments are based on the
additivity of functionals arising as a consequence of locality. These results are closely
related to the ideas about the role of topological testability and are one of the main future
directions of this work.

A related problem on which Adler, Bobrowski and Weinberger have made
considerable progress [4] is a phenomenon that we have named 'crackle'. Once again,
random Cech complexes are created, this time with fixed inter-point distances and based
over different types of samples. It was shown that if the additional noise is in some sense
large then sample points can appear basically anywhere, introducing extraneous
homology elements. We observe that Gaussian noise does not crackle (which explains
why topological methods have been of most use in that setting) but exponential and
scale-free noise have a lot of crackle.

A family of results, including a law of large numbers, will appear in a joint work of all
the PIs and co-PIs. Among these is the result that, for a sample of n exponential
variables, the expected number of bars in the zeroth order persistent homology with
length in the interval [x,y] tends, as n tends to infinity, to

\[ \int e^{-u}\,(1-e^{-u})^{-2}\,\mathbf{1}_{[2x,2y]}(u)\,du. \]

The blow up near the origin results from the fact that, with many points, there are very
many very short persistence intervals. The quadratic nature of the divergence is perhaps
typical. Similar things occur for persistence intervals in the level set filtrations of
Brownian motion. Overall, the nonzero measure associated to long intervals is essentially
a precise formulation of the crackle phenomenon.
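
The limit can be checked by simulation. The sketch below assumes that a zeroth order bar has length equal to half the gap between neighbouring sample points on the line, which is the convention consistent with the indicator of [2x,2y] in the integrand e^{-u}(1-e^{-u})^{-2}; integrating that integrand over [2x,2y] gives 1/(e^{2x}-1) - 1/(e^{2y}-1). The sample size, number of trials, seed, and function names are illustrative choices, not taken from the papers cited.

```python
import math
import random

def limit_bar_count(x, y):
    """Closed form of the integral of e^-u (1 - e^-u)^-2 over [2x, 2y]."""
    return 1.0 / math.expm1(2 * x) - 1.0 / math.expm1(2 * y)

def mean_bar_count(x, y, n=200, trials=2000, seed=0):
    """Monte Carlo average of the number of finite H0 bars with length in [x, y]."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        points = sorted(rng.expovariate(1.0) for _ in range(n))
        # Assumed convention: bar length = half the gap between neighbours.
        total += sum(1 for p, q in zip(points, points[1:]) if x <= (q - p) / 2 <= y)
    return total / trials

x, y = 0.1, 0.5
estimate = mean_bar_count(x, y)
limit = limit_bar_count(x, y)
print(estimate, limit)
```

Letting x shrink towards 0 makes the closed form grow like 1/(2x), which is the blow up near the origin noted above.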

In a subsequent paper [17] far more precise results are established. There, point
process convergence of spherically symmetric k-tuples (X_{i_1}, ..., X_{i_k}) of random points
in R^d is studied under certain geometric constraints. If the law of the random points in
R^d has either regularly varying or exponentially decaying tails that vanish slowly enough,
then a certain Poisson random measure becomes the weak limit of the point process. In
contrast, if the law of the random points has rapidly decaying exponential tails, the
corresponding point process tends to zero in probability. As an application, the homology
of the Cech complex built over those random points is studied. The weak convergence
result shows that Betti numbers of order up to d-1 either have Poisson limits or are
degenerate, depending upon how heavy the tails of the distributions of the random
points are.

As this paper was being written up it became clear that although the work was
originally motivated by, and has immediate applications to, topological data analysis, the
main results in some sense “belong” to the classical area of extreme value theory (EVT).
Consequently, the paper was written in the language of EVT, for two reasons. The first
was one of convenience: most of the natural notation and a lot of the needed pre-existing
lemmas came from there. The second was more long term, in that the authors wanted to
introduce to the EVT community, in a language with which they are familiar, the general
area of stochastic algebraic topology.

Despite this, the paper concludes in the language of applied topology, and allows
the topology-literate reader to understand the implications of EVT-type analysis for
applied topology.

Owada has taken this further in subsequent papers [32,33]. While still motivated
by the issue of crackle in TDA, and thus interested in the behavior of Betti numbers and
other topological aspects of Cech complexes, the approach taken in this work is to
investigate the limiting behavior of a subgraph counting process, when the graph in
question is the 1-skeleton of the complex. In particular, the subgraph counting process at
the core of the paper counts the number of subgraphs having a specific shape that exist
outside an expanding ball as the sample size increases. As an underlying law, the paper
considers distributions with a regularly varying tail and those with an exponentially
decaying tail.

The  aim  is  then  to  obtain  functional  limit  theorems  for  these  processes  as  the 
underlying  scale  parameter  of  the  Cech  complex  changes.  This  is,  of  course,  a  much  more 
sophisticated  result  than  a  standard  (central)  limit  theorem,  and  is,  to  the  best  of  our 
knowledge,  the  first  time  that  a  functional  limit  theorem  has  been  proven  in  the  setting  of 
random  topology.  Regarding  the  specific  results,  it  is  seen  that  the  nature  of  the 
functional  central  limit  theorem  differs  according  to  the  speed  at  which  the  ball  expands. 
In  fact,  the  proper  normalizations  for  the  limit  theorems  and  the  properties  of  limiting 
Gaussian  processes  are  all  determined  by  whether  or  not  an  expanding  ball  covers  a 
region  -  called  a  weak  core  -  in  which  the  random  points  are  densely  scattered  and  form  a 
giant  geometric  graph. 

The  results  of  this  work  not  only  have  implications  for  increased  understanding  of 
the  structure  of  persistent  homology  under  crackle  -  an  issue  of  applied  relevance  -  but 
have  significant  intrinsic  interest.  In  particular,  the  limiting  stochastic  processes  that 
appear  here  seem  to  be  completely  new  in  the  context  of  probability  theory. 

3:  Random  manifolds  and  random  embeddings 

In the initial proposal we noted that random manifolds arise in a number of
scenarios, and that one of the key geometric quantities that arises in recovering the
homology of a manifold M ⊂ R^n by randomly sampling points from it is the critical
radius τ of the manifold.

Roughly  speaking,  the  reach,  or  critical  radius,  of  a  manifold  is  a  measure  of  its 
departure  from  convexity  that  incorporates  both  local  curvature  and  global  topology.  It 
plays  a  major  role  in  many  aspects  of  differential  geometry,  and  more  recently  has  turned 
out  to  be  a  crucial  parameter  in  assessing  the  efficiency  of  algorithms  for  manifold 
learning. 

As the critical radius depends on the embedding, it is of interest to study the
behavior of the critical radius of a Riemannian manifold (M, g) for a generic, or random,
embedding of M into R^n for large n. A natural model to consider is based on taking
independent, identically distributed copies, f_1, ..., f_n, of a real-valued random field on M
and then working with f = (f_1, ..., f_n) : M → R^n to define the random, embedded manifold
f(M). Each such random embedding gives rise to a random Riemannian metric on M that
is naturally related to the original metric g. Other features of interest include
the geometric invariants of such Riemannian metrics, such as volume,
curvature, diameter, etc.

Together with a Technion graduate student, Sreekar Ram Krishnan, Adler, Taylor
and Weinberger have shown [7] that the self-normalised critical radii of these randomly
embedded manifolds converge almost surely to a deterministic limit determined by the
structure of the underlying manifold M and the covariance function of the process.

Somewhat  unexpectedly,  this  limit  turns  out  to  be  the  same  one  that  arises  in 
studying  the  exceedance  probabilities  of  the  Gaussian  process  over  the  manifold. 

En passant it was also proven that the induced embeddings are asymptotically
isometric, from which it follows that other properties of the embedded manifolds, such as
volume, curvature integrals, etc., also converge. Underlying this there turns out to be a
much deeper notion of convergence; [23] proves such results.
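
A toy case makes the asymptotic isometry concrete. The sketch below is an illustration under strong assumptions, not the construction of [7] or [23]: it takes M to be the unit circle and each coordinate field to be f_i(t) = ξ_i cos t + η_i sin t with independent standard Gaussian coefficients, so that f(M)/√n is an ellipse whose squared speed is (|ξ|² sin² t - 2 ξ·η sin t cos t + |η|² cos² t)/n. By the law of large numbers this tends to 1, so the normalised length should approach 2π, the length of M.

```python
import math
import random

def normalised_length(n, seed=0, steps=2000):
    """Length of the random curve f(M) / sqrt(n), for M the unit circle."""
    rng = random.Random(seed)
    xi = [rng.gauss(0.0, 1.0) for _ in range(n)]
    eta = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Normalised Gram entries of the two coefficient vectors.
    xx = sum(a * a for a in xi) / n
    ee = sum(b * b for b in eta) / n
    xe = sum(a * b for a, b in zip(xi, eta)) / n
    dt = 2 * math.pi / steps
    length = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt  # midpoint rule in the circle parameter
        s, c = math.sin(t), math.cos(t)
        speed_sq = xx * s * s - 2 * xe * s * c + ee * c * c
        length += math.sqrt(max(speed_sq, 0.0)) * dt
    return length

lengths = {n: normalised_length(n) for n in (10, 1000, 100000)}
for n, value in lengths.items():
    print(n, value, 2 * math.pi)
```

The same Gram-matrix argument shows that the ellipse tends to a round unit circle, which is the simplest instance of the convergence of geometric invariants described above.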

This collection of theorems has an important philosophical implication, which
is encouraging for topological data analysis. Although the sample complexity of
learning a manifold grows (exponentially) with ambient dimension (see [47]), even with
lower bounds on the critical radius and upper bounds on the diameter, for a given “platonic
ideal” the random embeddings do not suffer this defect, and generically the image
manifold can be learned with a sample complexity that does not grow with dimension,
even in the presence of (controlled) noise.

4: Other directions that have grown out of this work

(i) The lower complexity bounds established, in general, in [47]
pose an important issue for TDA. Many of the usual questions people ask are infeasible
in general: computation of invariants is too difficult, and the number of topological types is
too large. Applications of topological methods must either explain why the data should
be suitable for those methods, e.g. why the complications that could arise do not (as in
the work [7] discussed in the previous section), or they
should be focused on invariants that can be measured. Weinberger has been studying
such invariants, modeled on the notion of testability of graph properties. The simplest of
these is the Euler characteristic divided by the volume, which is essentially (for
Riemannian manifolds) the average (Pfaffian of the) curvature. As an average, it can be
estimated by sampling. Thus, a large submanifold in Euclidean space whose average
curvature is large will surely have complicated topology, and discovering its topological
properties will require enormous sampling and computational resources.
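
For reference, the identity behind this remark is the Chern-Gauss-Bonnet theorem. For a closed, oriented Riemannian manifold M of even dimension 2m with curvature form Ω, and with the usual caveat that sign and normalisation conventions for the Pfaffian vary between authors, it can be written as:

```latex
\chi(M) \;=\; \frac{1}{(2\pi)^{m}} \int_M \operatorname{Pf}(\Omega),
\qquad\text{so that}\qquad
\frac{\chi(M)}{\operatorname{vol}(M)}
  \;=\; \frac{1}{\operatorname{vol}(M)} \int_M \frac{\operatorname{Pf}(\Omega)}{(2\pi)^{m}} .
```

The right-hand side is the average over M of a pointwise curvature density. For m = 1 it reduces to the classical Gauss-Bonnet formula, with the density equal to the Gaussian curvature divided by 2π.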

Similarly, characteristic 0 Betti numbers seem to be testable, but not too
straightforwardly: random regular graphs have high Euler characteristic and first Betti
number, but viewed from a random vertex they look like trees, which have no local
topology! (The fact that the ratio can be approximated via finite samples is a consequence
of Hodge theory.)

Whether this is the case for mod p Betti numbers is an important problem.

Weinberger has shown that for “geometric complexes”, i.e. those which embed in
Euclidean space with bounded geometry, the phenomenon of “foamless foam”, i.e. the
presence of large homology without any homology being present at a small scale, does
not occur. This suggests that better algorithms for computing such invariants might be
possible for the geometric case. These results are consistent with the results of [51] on
thermodynamic limits. Thermodynamic limits also arise in the dual question: throwing
away the many visible small cycles and looking for the birth of the large, macroscopic
cycles. Bobrowski and Weinberger have been studying this and its connection to
percolation theory. These works are not yet complete.


(ii) Another “spin off” is the paper [20], which deals with the question of whether
there are manifolds that can be made arbitrarily close to one another in
Gromov-Hausdorff space with a local contractibility function. Any such manifolds must be
homotopy equivalent and must have the same rational characteristic classes. (The first is
easy; the second is a deep theorem of S. Ferry.)

However, we show that there are indistinguishable manifolds, and even some
infinite sets of such manifolds. We also show that for “reasonable fundamental groups”
the set of doppelgangers of a given manifold is finite, although it can be nonempty.

In revised versions of this paper, the connection between this topological problem
and analytic methods based on C* algebras has been strengthened, resulting in the paper
being rewritten to take this into account. As a consequence the theory is now essentially
complete for many fundamental groups (including abelian, and torsion free linear
groups).

(iii) Taylor and several co-authors have been studying inferential problems in statistics
and machine learning related to critical points of common objective functions
encountered in machine learning. A canonical example of such an objective function
would be the LASSO (squared error plus an L1 penalty). The solution to this problem is a
critical point, and many of the tools developed by Adler and Taylor in the theory of
smooth random fields on piecewise smooth spaces are applicable to such problems.
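
To make the phrase "the solution is a critical point" concrete, the following minimal sketch solves a tiny LASSO problem by cyclic coordinate descent and then checks the stationarity (KKT) conditions. It assumes the scaling 0.5*||y - X b||^2 + lam*||b||_1 for the objective; the data, the function names and the choice of solver are illustrative assumptions, not the methods of the cited papers.

```python
def soft_threshold(z, lam):
    """Proximal map of lam*|.|: shrink z toward zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, sweeps=200):
    """Minimize 0.5*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, kept up to date
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            # Inner product of column j with the residual that excludes column j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def kkt_violation(X, y, beta, lam):
    """Largest violation of the subgradient conditions at beta.

    Active coordinates need X_j^T (y - X beta) = lam * sign(beta_j);
    inactive coordinates need |X_j^T (y - X beta)| <= lam.
    """
    n, p = len(X), len(X[0])
    resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    worst = 0.0
    for j in range(p):
        g = sum(X[i][j] * resid[i] for i in range(n))
        if beta[j] != 0.0:
            sign = 1.0 if beta[j] > 0 else -1.0
            worst = max(worst, abs(g - lam * sign))
        else:
            worst = max(worst, max(0.0, abs(g) - lam))
    return worst

# Illustrative data: the second feature is dropped by the penalty.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [3.0, 0.1, 3.0]
lam = 1.0
beta = lasso_coordinate_descent(X, y, lam)
print(beta, kkt_violation(X, y, beta, lam))
```

The set of active coordinates and their signs is exactly the kind of selection event on which the inferential work described below conditions.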

Taylor and several co-authors have continued work on selective inference
reported in SATA's 2014 annual report. The main methodological contribution [21]
describes a formal approach to inference after model selection where model selection is
broadly described as observing partial information about the entire sample. Previously
reported work [25] applied an early version of this framework to inference after selecting
features using the LASSO.
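
The basic idea of conditioning on selection can be seen in a one-dimensional toy problem. The sketch below assumes a single statistic Z ~ N(mu, 1) that is only reported when it exceeds a threshold, and compares the naive p-value for mu = 0 with the p-value computed from the normal distribution truncated to the selection event. This is an illustration of the principle under these stated assumptions, not the framework of [21] or the LASSO procedure of [25]; the function names are hypothetical.

```python
import math

def normal_sf(x):
    """Upper tail probability P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def naive_p_value(z):
    """One-sided p-value for mu = 0 that ignores how z was selected."""
    return normal_sf(z)

def selective_p_value(z, threshold):
    """One-sided p-value for mu = 0 conditional on the event Z > threshold."""
    if z < threshold:
        raise ValueError("z was not selected: it does not exceed the threshold")
    return normal_sf(z) / normal_sf(threshold)

# Illustrative numbers: a statistic reported only because it exceeded 2.0.
z_observed, threshold = 2.5, 2.0
print(naive_p_value(z_observed))
print(selective_p_value(z_observed, threshold))
```

Under the null hypothesis the conditional p-value is uniformly distributed given selection, whereas the naive one is much too small; the selective p-value is always at least as large as the naive one.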

One of the key constructions in [13] is the idea of performing inference under a
selected model as opposed to inference for different parameters of the same model as in
[14]. This technical distinction allows for valid inference in regression models with
unknown variance [27]. Other work in this vein includes a version of selective inference in
which the sample is randomized before a model selection algorithm is applied [40]. These
algorithms are similar to those appearing in the differential privacy literature (cf. “The
reusable holdout: preserving validity in adaptive data analysis”, Science, 2015). The
intervals and hypothesis tests in [40] are less conservative than the differential privacy
approach.

Work is continuing on applying this approach to model selection from a
continuum of models as in [19], in which a selective inference algorithm is proposed for a
sequential algorithm to determine the rank of a spiked covariance model in PCA. The
proposed algorithm is less conservative than asymptotic approaches based on RMT.
Current work is focused on applying these techniques to CCA (canonical correlations
analysis) and relating our finite sample size algorithms to the asymptotic RMT
approaches. The complete development of these ideas appears in the papers
[12, 19, 21, 24, 25, 26, 27, 28, 34, 35, 36, 37, 38, 40, 41].

[37] extends the exact selective inference of [25] from the Gaussian least squares setting to
the setting of general likelihoods with a LASSO penalty. It describes how to remove the
parametric modelling assumption for the covariance by using a bootstrap estimate of
covariance, which also removes the assumption that the selected model is correct.

The paper [26] considers selective inference in regression problems where features are
clustered and one uses a prototype to represent the entire cluster in a regression model.
The authors derive an analog of the F-test for the entire cluster given that its prototype
was selected in a model selection procedure like the LASSO.

[41] Building on the framework of selective inference after randomization in [37], we
describe a simple randomization scheme that yields an explicit formula for the selective
likelihood ratio which is necessary for selective inference. The construction relies on an
exact inversion of the KKT conditions of a particular randomized LASSO problem.
Co-author Nan Bi was supported by AFOSR in carrying out this work.

[54] This paper extends the approach of [41] to fairly arbitrary convex programs.
Notably, penalties with some curvature such as the group LASSO can easily be handled
in this fashion, as well as multiple steps of forward stepwise and “top K” marginal
screening. For penalties with curvature, the change-of-measure formula involves a
Jacobian encoding similar geometric structure to the Jacobian in Steiner-Weyl
volume-of-tubes formulae.

[34] This paper considers selective inference in a Bayesian context,
building on an approach for univariate problems in [49]. The main technical difficulty in
this work is computing the selection probability as a function of the parameter on which a
prior is specified. We use a large-deviations approximation to this probability that
involves solving a well-defined convex program for each step of the Markov chain. This
program possesses nice structure, particularly if selection is randomized.

[21] produces a sequential model selection algorithm that satisfies the hypotheses of the
sequential FDR-controlling procedure of [22]. We demonstrate an improvement in power
over the spacings test of [41], whose tests also fail to satisfy the hypotheses required for
control of FDR.

(iv) Baryshnikov and Mileyko have been studying problems related to many of the
above, but on networks. The persistence homology of networks (with analytic growth
bounds, with respect to steadily increasing scale) has been related to network flow and
congestion problems as well as to defining dimensions of networks*. Baryshnikov and his
postdoc Yuri Mileyko continued the studies of the (local) dimensions of synthetic and
real-life networks. The background for this quest was an emerging trend in the networking
community to view networks as hyperbolic in some sense (e.g. as being CAT(0) spaces,
or Gromov-hyperbolic, etc.). One thread in this area of research had as an underlying
premise that real-life networks can be properly modeled by random geometric graphs
sampled from a ball in homogeneous hyperbolic spaces, or, even more specifically, from
the hyperbolic plane. While the pictorial representations in the numerous publications in
this spirit looked convincing, the basic questions were not asked: why the hyperbolic
plane? Why the assumptions of homogeneity?

In general, the random finite metric space obtained by a dense enough sampling from a
Riemannian manifold would provide enough data to detect at least the dimension of the
underlying manifold: if X ⊂ M is a finite sample from M, a manifold of dimension m,
then for spherical shells of points in X, and judiciously chosen radii R, r (R much less
than the injectivity radius, r large to ensure dense sampling), the persisting homology of
the Rips complex of S_X(x; R, r) = H(X, X-x; R, r) should be concentrated in dimensions 0
and m, for interior points of the sample, and just in dimension 0 for the points near the
boundary.
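
As a rough illustration of the ingredients involved, the sketch below builds the Rips complex of a small point set at a single scale and computes its Betti numbers b0 and b1 over the rationals. It is deliberately simpler than the construction above: it computes ordinary (not relative, and not persistent) homology at one fixed scale, and the evenly spaced circle of points stands in, as an assumption, for a thin shell around a point of a 2-dimensional sample. All names are hypothetical.

```python
import math
from fractions import Fraction
from itertools import combinations

def rank(rows):
    """Rank of an integer matrix over the rationals, by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            if m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def rips_betti(points, eps):
    """Return (b0, b1) of the Rips complex of `points` at scale `eps`.

    Only simplices up to dimension 2 are built, which is enough for b0 and b1.
    """
    n = len(points)
    edges = [(i, j) for i, j in combinations(range(n), 2)
             if math.dist(points[i], points[j]) <= eps]
    edge_index = {e: k for k, e in enumerate(edges)}
    triangles = [t for t in combinations(range(n), 3)
                 if all(pair in edge_index for pair in combinations(t, 2))]
    # Transposed boundary matrices: one row per edge / per triangle.
    d1 = [[0] * n for _ in edges]
    for k, (i, j) in enumerate(edges):
        d1[k][i], d1[k][j] = -1, 1
    d2 = [[0] * len(edges) for _ in triangles]
    for k, (a, b, c) in enumerate(triangles):
        d2[k][edge_index[(b, c)]] = 1
        d2[k][edge_index[(a, c)]] = -1
        d2[k][edge_index[(a, b)]] = 1
    r1, r2 = rank(d1), rank(d2)
    return n - r1, len(edges) - r1 - r2

# Illustrative sample: 12 evenly spaced points on the unit circle.
circle = [(math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12))
          for k in range(12)]
for eps in (0.3, 0.6, 1.1):
    print(eps, rips_betti(circle, eps))
```

At very small scales the points are disconnected, while at intermediate scales the complex has one component and one independent loop, as expected for a circle. The experiments described in the text track how such features persist across scales and use relative homology near each sample point.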

Experiments confirmed that this is exactly what happens for samples from the hyperbolic
plane. However, contrary to what one might expect from the existing literature, the
analysis of the ASN network (the world-visible structure of the autonomous domains,
roughly the network of Internet connections) shows that its local homologies are
extremely wild and irregular, and are nowhere close to those of a sample from the
hyperbolic plane (or any manifold). On the positive side, the local homology is yet another
characteristic of the nodes in large graphs, and we plan to use it systematically for
network analysis (and, perhaps, to analyze samples from singular spaces, in a TDA
fashion).

The results of these experiments are available at the web site
http://publish.illinois.edu/vmb/2014/09/21/dimension-of-the-internet/ and show how the
local homologies behave for samples of the hyperbolic plane and for the “internet graph”.

Other applications of these methods to large networks obtained from sampling large
geometric function spaces will appear in the revised version of [16].

(v) S. Mukherjee and Katharine Turner have developed a persistent homology
transform that has application to shape statistics - describing a shape by a “Radon
transform” of the persistent homology of the height functions in all the different
directions, and applied this, together with D. Boyer in the Evolutionary Anthropology
Department at Duke University, to data comparing calcanei bones of various primates
[33]. Turner and V. Robins have been studying persistence homology invariants of sand
and other disordered materials [34]. This, too, has strong connections with the theory of
testable invariants mentioned above.

* See the paper of Block and Weinberger on Large Scale Homology theories of
Metric spaces and Baryshnikov, Bonahon, Jonckheere and Lou, on Euclidean versus
Hyperbolic congestion for some background, all available on the authors’ web pages.

(vi) Arnold, Baryshnikov and Mileyko’s paper [25] studies the typical shapes of the
(discretized) loops sampled uniformly from the space of loops in a given (free) homotopy
class on a surface. It is shown that there exists a large deviation principle forcing the
sampled loop to be close to the solution of a variational problem. (For example, in the
case of the plane with punctures, to a collection of straight segments representing the
minimal loop in the given homotopy class.) We note that this result is in tension with the
large amount of time that loops can spend near an index zero geodesic that is not
actually length minimizing. A given homotopy class can have infinitely many such
geodesics even for a bumpy metric on S^2. Nevertheless, [25] asserts that in that case
“almost all” geodesics will be “pointlike”.

(vii) In [9], Baryshnikov studies problems related to tiling spaces - a topic closely related
to the mathematical physics of testability and to the problem of defining invariants that
can be computed quickly. The classical Wang (2D)-tilability problem asks whether one
can tile the plane using a collection of domino tiles (with marked boundaries, under the
matching boundaries constraints). Motivated by some questions from Markov Random
Fields, we investigate the same problem under constraints on the (asymptotic) frequencies
of tiles of each type. There are some natural conditions coming from matching the boundary
frequencies, but, as it turned out, they are not sufficient. We prove that the realizable
frequencies form a convex proper subset of the polyhedron of feasible frequencies. In a
sequel (finalized now with Abram Magner and Szpankowski) we ask a more general
question: what is the average asymptotic genus of a 2D surface with a free Z^2 action
admitting a tiling with given frequencies of tiles?

(viii) We conclude by discussing some more engineering applications of topological
ideas. These were done by Baryshnikov and collaborators.

In the three papers [11, 13, 46] the authors analyze the topology of the configuration
space from the viewpoint of complexity of any feedback control stabilizing the trajectory
under the (stochastic or not) perturbations on an attractor. The topology of the
configuration space is critical for the structure of the feedback control loop. In the first
paper, we look at the configuration space of a multi-legged robotic device, which turned
out to be related to moment-angle complexes and Stanley-Reisner rings.

It is well-known that randomly switching between multiplications by several
operators can lead to divergent dynamics, even if each operator in the family is
asymptotically contracting. Some algebraic conditions (such as solvability of the Lie
algebra generated by the operators) prevent such anomalies. Solvability is not an open
property, motivating the study [15], which proves that slightly relaxing the solvability
condition keeps the switched systems stable.
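
The anomaly described above already appears for 2x2 matrices. The sketch below uses matrices chosen here purely for illustration (they are not taken from [15]): A and B each have spectral radius 0.5, yet the product AB has spectral radius above 1, so alternating between them diverges. A and C, by contrast, are both upper triangular, so they generate a solvable Lie algebra and their product remains contracting.

```python
import math

def matmul(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def spectral_radius(M):
    """Largest eigenvalue modulus of a real 2x2 matrix, from trace and determinant."""
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = trace * trace - 4.0 * det
    if disc >= 0.0:
        root = math.sqrt(disc)
        return max(abs(trace + root), abs(trace - root)) / 2.0
    # Complex conjugate pair: the squared modulus equals the determinant.
    return math.sqrt(det)

A = [[0.5, 2.0], [0.0, 0.5]]   # contracting: spectral radius 0.5
B = [[0.5, 0.0], [2.0, 0.5]]   # contracting: spectral radius 0.5
C = [[0.5, -1.0], [0.0, 0.5]]  # upper triangular, like A

print(spectral_radius(A), spectral_radius(B))
print(spectral_radius(matmul(A, B)))  # exceeds 1: switching A, B, A, B, ... diverges
print(spectral_radius(matmul(A, C)))  # stays below 1 for the triangular pair

# Follow one trajectory under strict alternation between B and A.
x = [1.0, 1.0]
for step in range(10):
    M = B if step % 2 == 0 else A
    x = [M[0][0] * x[0] + M[0][1] * x[1], M[1][0] * x[0] + M[1][1] * x[1]]
print(math.hypot(*x))
```

Strict alternation is only one realization of random switching, but it shows that contraction of each operator alone does not control the products.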

Training of Graduate and Postdoctoral Fellows

Omer Bobrowski, a student of Adler, was partly trained on this grant. After a
postdoc at Duke, he has taken a tenure-track position at the Technion.

Weinberger’s student, Katharine Turner, got her Ph.D. and is now a postdoc at
EPFL (Lausanne).

Eliran Subag did his master's at the Technion, moved to Weizmann for Ph.D.
studies, and will be receiving his PhD in 2017.

Gregory Naitzat moved to the University of Chicago (Statistics) from the Technion.
Yogeshwaran Dhandapani, who was a postdoc at the Technion, moved to the Indian
Statistical Institute in Bangalore.

Sunder  Ram  Krishnan  is  continuing  with  his  PhD  and  should  finish  in  2017. 

A recent Stanford PhD, Michael Lesnick, visited the Technion for 3 months,
before taking up a postdoctoral position at Princeton (to work with MacPherson). He is
now at the Princeton Neuroscience Institute.

Han Wang, a PhD student at UIUC, defended the PhD and is now a postdoc at NCSU.

Harish Chintakunta, a postdoc of Baryshnikov’s, is now at Florida Polytechnic.

Yuriy Mileyko, another of Baryshnikov’s postdocs, moved to the University of
Hawaii.


Selected  talks  and  conference  organization. 

In January 2012, the AMS Short Course on Random Fields and Random
Geometry was organized by Adler and Taylor at the AMS Annual Meeting,
Boston.

Adler and Taylor co-organized the tutorial An Introduction to Statistics and Probability
for Topologists at the IMA in October 2013, as well as the IMA Workshop on Topological
Data Analysis which followed the tutorial session. Weinberger spoke at this meeting. In
February 2014, Adler co-organized the SAMSI workshop on Low Dimensional Structure in
High Dimensional Systems (LDHD): Topological Data Analysis.

In April 2015, Adler gave the Annual de Rham Lecture at EPFL, Lausanne,
Switzerland (Phase Transitions and Random Topology).

Adler also was a member of the Scientific Committee of the June 2015 meeting
DyToComp (Dynamics, Topology and Computations), in Bedlewo, Poland. In August
2015, at the Stochastic Geometry Workshop in Poitiers, France, he spoke on Topological
Phase Transitions and also gave a course on applied topology. In September 2015, he
spoke at the Heilbronn Annual Conference, Bristol, UK.

Adler was a member of the scientific committee of Extreme Value Analysis, EVA15,
Ann Arbor, Michigan. Adler spoke in November 2014 at the Workshop on Discrete,
Computational and Algebraic Topology, Copenhagen (Pondering Persistence and
Extolling Euler). Finally, in Summer/Fall 2016, Adler was a member of
the scientific committee of the Thematic Semester on Probabilistic
Methods in Geometry, Topology, and Spectral Theory, Centre de Recherches
Mathématiques, Montreal.

Baryshnikov  and  Weinberger  gave  plenary  talks  at  ATMCS  6  (British  Columbia) 
in  May  2014. 

Baryshnikov organized a special semester on applied algebraic topology at
ICERM in Fall 2016.

Baryshnikov presented some of the results on random networks at the NIST-Bell
Labs workshop on Geometry of Networks, at NIST, the MCA special session on Applied
Algebraic Topology, and a plenary talk at the SIAM conference on Applied Algebraic
Geometry.

Baryshnikov, Taylor and Weinberger all spoke at Stochastic Processes and
Random Fields: Geometry and Fine Properties, at the Technion, June 2015.

Taylor spoke at Statistical Inference for Large Scale Data, Simon Fraser, April
2015.

Taylor spoke at a special Topological Data Analysis workshop at NIPS in
December 2012. He was an invited speaker at the European Meeting of Statisticians and
participated at the IMA workshop in October 2013.

Taylor and his collaborators gave several talks at the Joint Statistical Meetings in
Boston in August 2014.

Taylor also gave the Berkeley-Stanford Colloquium, Berkeley, April 2015, and at
JSM 2015 he organized an ISM invited session on “Post-Selection Inference”.
One of the speakers was student Joshua Loftus, speaking on their joint work. Taylor
also gave an invited talk in the session on “Modern Inferential Methods for Big Data
Analysis”.

Weinberger gave a plenary talk at the Applied Algebraic Topology meeting in
Bedlewo. He gave the “Frontiers of Mathematics” lecture series at Texas A&M, one of
the lectures featuring ideas related to property testing and its connections to both pure and
applied problems. He lectured three times at IMA during 2013-14, and visited ICERM
four times in fall 2016.

Weinberger organized a conference “Geometric Methods in Data Analysis” in
May 2015 at the Stevanovich Center for Financial Mathematics (Chicago). Adler,
Baryshnikov and Bobrowski were among the invited speakers. Weinberger was also an
invited speaker at the June 2015 DyToComp (Dynamics, Topology and Computations)
meeting in Bedlewo, Poland, for which Adler served on the Scientific Committee.

Weinberger spoke at the joint IAS-Penn-Rutgers seminar on applied topology and
gave a colloquium at Yale (on quantitative topology, which is a theme that overlaps this
project). He also gave a lecture in the Simons Science Series at the Simons Foundation in
New York. Both of these took place in November 2014. In February, he gave the
applied math colloquium at Stanford.

Finally, we are very happy to report on Adler’s four pieces for the IMS,
introducing the ideas of applied topology to a very broad audience of statisticians.

Honors and awards

Adler was invited to give a plenary Special Invited Lecture at the European
Meeting of Statisticians in Budapest, July 2013, and was awarded a prestigious European
Research Council Advanced grant. He was awarded the 2014 Henry Taub Prize for
Academic Excellence at Technion, and gave the 2015 de Rham lecture of the Swiss
Doctoral Programme, and the 2015 Heilbronn lecture. Finally, he was invited to give a
Plenary lecture at the 2016 British Mathematical Colloquium.

Jonathan Taylor gave an invited lecture at the Bernoulli World Congress 2016 in
Toronto, Canada, and the Scandinavian Journal of Statistics invited talk, Selective
inference in regression, at NORDSTAT 2016 in Copenhagen, Denmark.

Shmuel Weinberger was inducted in 2012 into the inaugural class of Fellows of
the American Mathematical Society. In 2013 he became a Fellow of the American
Association for the Advancement of Science. In 2015, Weinberger was appointed the
Andrew MacLeish Distinguished Professor of Mathematics at the University of Chicago.
He gave the Frontiers in Mathematics lecture series at Texas A&M in 2013, the MINT
Distinguished lectures at Tel Aviv University in November 2015, and was invited to give
the 2017 Minerva lectures at Princeton University, an invited lecture at the 2017
Mathematical Congress of the Americas and a plenary lecture at the tri-annual meeting of
FoCM in Madrid.

The graduate student Turner received the 2013 Stevanovich Center for Financial
Math Fellowship for her work on the Persistent Homology transform and its application
to evolutionary biology data.